tests: fix parametrize patterns rejected by pytest 9.1.0 #2212
Merged
Two latent test-code bugs that older pytest tolerated but pytest 9.1.0
flags as collection errors, breaking every Test job on main since the
pytest 9.1.0 release:
* cuda_core/tests/test_utils.py:151 had a stray trailing comma in the
`parametrize` name string (`"in_arr,"`). pytest 9.1.0 now splits names on
comma and counts them, mismatching against the multi-element value
tuples. Drop the comma.
* cuda_bindings/tests/test_nvfatbin.py had two tests using
`@pytest.mark.parametrize("arch", ["sm_80"], indirect=True)` to
override the fixture-level `arch` parametrization. pytest 9.1.0 now
rejects this combination as "duplicate parametrization of 'arch'".
Extract the CUBIN-building logic into a `_build_cubin(arch)` helper,
drop the indirect override on the two tests, and call the helper
inline with the hardcoded `"sm_80"` they need. Preserves intent (the
override existed because target arch "75" must not match the CUBIN's
arch).
Both fixes are pytest-version-agnostic; verified collecting cleanly
under pytest 9.1.0, 9.0.2, and 8.4.2 with minimal reproductions of
each pattern.
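
The name-count mismatch behind the first bug can be sketched with a small stand-alone model. This is not pytest's implementation: `split_names`, `count_mismatches`, and the `samples` data are hypothetical, and the rule that a comma in the name string switches every case to "argument tuple" mode is an assumption chosen to match the behavior described above (one name, multi-element values).

```python
def split_names(argnames: str) -> list[str]:
    """Split a comma-separated name string, ignoring empty pieces."""
    return [piece.strip() for piece in argnames.split(",") if piece.strip()]


def count_mismatches(argnames: str, argvalues: list) -> list[str]:
    """Model of a strict names-vs-values count check (not pytest's code).

    Assumption: a comma in the name string means every case is an argument
    tuple, while a bare single name means every case is one value.
    """
    names = split_names(argnames)
    tuple_mode = "," in argnames
    problems = []
    for index, case in enumerate(argvalues):
        values = tuple(case) if tuple_mode else (case,)
        if len(values) != len(names):
            problems.append(
                f"case {index}: {len(names)} name(s) but {len(values)} value(s)"
            )
    return problems


# Hypothetical stand-in for _cpu_array_samples(): each case is a 3-element sequence.
samples = [(1, 2, 3), (4, 5, 6)]

with_stray_comma = count_mismatches("in_arr,", samples)
without_comma = count_mismatches("in_arr", samples)

print(with_stray_comma)  # one mismatch message per case
print(without_comma)     # empty list: each case is passed whole as in_arr
```

In this model, dropping the comma is enough to make every case count as a single `in_arr` value, which is why the fix needs no change to the sample data.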
Contributor
rwgk
approved these changes
Jun 14, 2026
rwgk
left a comment
Contributor
I saw ... it can only get better!
rwgk
approved these changes
Jun 14, 2026
rwgk
left a comment
Contributor
GPT-5.5:
No code findings from my review.
The two edits look correct and narrowly scoped:
* `cuda_core/tests/test_utils.py`: fixes the stray `@pytest.mark.parametrize("in_arr,", ...)` name. `_cpu_array_samples()` supplies one argument per case, so `in_arr` is the intended single parameter name.
* `cuda_bindings/tests/test_nvfatbin.py`: extracts the old `CUBIN` fixture body into `_build_cubin(arch)`, keeps the fixture behavior unchanged, and lets the two mismatch tests build only `sm_80` without re-parametrizing the existing `arch` fixture.
Operationally, I would not call the PR merge-ready until full CI runs. Right now the visible checks only include path-label/restricted-path/metadata checks plus pre-commit.ci, and the copy-pr-bot comment says the PR still needs validation before NVIDIA runner workflows can run. Code-wise this looks ready to test; process-wise it still needs the full CI trigger and a green run before merging.
Contributor
/ok to test fadd5bd
This was referenced Jun 15, 2026
leofang added a commit that referenced this pull request Jun 16, 2026
Backport of #2212, scoped down to the cuda_bindings/tests/test_nvfatbin.py portion that applies to 12.9.x.
The cuda_core/tests/test_utils.py portion of #2212 (the trailing-comma-in-parametrize-name fix) does not apply here because the 12.9.x version of that test file does not have the bug: its parametrize uses two names matching tuple values.
What is fixed (verbatim from #2212):
cuda_bindings/tests/test_nvfatbin.py had two tests using @pytest.mark.parametrize("arch", ["sm_80"], indirect=True) to override the fixture-level `arch` parametrization. pytest 9.1.0 now rejects this combination as "duplicate parametrization of 'arch'". Extract the CUBIN-building logic into a _build_cubin(arch) helper, drop the indirect override on the two tests, and call the helper inline with the hardcoded "sm_80" they need. Preserves intent (the override existed because target arch "75" must not match the CUBIN's arch).
Closes #2226.
Hunk body verified identical to the corresponding hunk in #2212 (commit a9156b6).
leofang added a commit to leofang/cuda-python that referenced this pull request Jul 1, 2026
Two nightly failure fixups after the first green iteration:
* nightly-numba-cuda-mlir: numba-cuda-mlir 0.4.0 has an inverted guard that registers an overload of np.row_stack on NumPy 2.x, and NumPy 2.5 removed that name entirely, so test collection fails with "AttributeError: module 'numpy' has no attribute 'row_stack'". Cap numpy to <2.5. See NVIDIA/numba-cuda-mlir#154.
* nightly-cuda-core: released cuda-core v1.0.1's test suite uses a parametrize argvalues pattern that pytest 9.1 rejects ("in parametrize the number of names (1)... must be equal to the number of values (3)"). The main-side fix was NVIDIA#2212 but it has not shipped in a cuda-core release yet. Cap pytest to <9.1 for the released-cuda-core test run only.
leofang added a commit that referenced this pull request Jul 2, 2026
* CI: add nightly-cuda-core and nightly-numba-cuda-mlir modes
  nightly-cuda-core: test the released cuda-core from PyPI against main-built pathfinder and cuda-bindings, catching the "core released × bindings main" gap documented in issue #1955. Runs on linux-64 (a100) and win-64 (a100 MCDM).
  nightly-numba-cuda-mlir: MLIR-backend companion to nightly-numba-cuda. Installs main pathfinder+bindings+core plus numba-cuda-mlir from PyPI, runs numba-cuda-mlir's own test suite from the matching git tag. Linux amd64/arm64 x CUDA 12.9.1 / 13.3.0.
  Both modes fetch the released version's tests from git tags because the respective wheels do not ship test_*.py files. Includes tag-not-found fallback (log warning + exit 0) to avoid red-lining the nightly on a freshly-cut PyPI release that hasn't been pushed to git yet.
* ci/test-matrix.yml: fix CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM typo
  The two ENV overrides intended to exercise the per-thread default stream code path were misspelled (missing the CUDA_ segment), so the env var was silently ignored and the PTDS coverage added in #1972 had no effect. Rename to the correct CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM. Refs #971.
* cuda_pathfinder: pin nvshmem to <3.7 (was previously excluding only 3.7.0)
  nvidia-nvshmem-cu{12,13} 3.7.x breaks the main branch, not only 3.7.0. Widen the exclusion from an exact-version bump to <3.7 so 3.7.x and above are avoided until we can move forward.
* nightly-numba-cuda-mlir: swap arm64 for win-64 coverage, use rtxpro6000
  Drop the linux-aarch64 rows and instead add win-64 coverage with the same CUDA 12.9.1 / 13.3.0 pair. Switch all four rows from GPU l4 to rtxpro6000. Windows rows use DRIVER_MODE MCDM, matching the existing rtxpro6000 CUDA 13.3.0 patterns.
* Temporarily add push trigger to ci-nightly.yml for testing
  Remove before merging.
* CI: switch nightly-{cuda-core,numba-cuda-mlir} to actions/checkout for tests
  The initial approach used git inside the ubuntu:24.04 container to fetch the released version's test suite, but git is not installed on that container (install_unix_deps only pulls in jq/wget/g++/etc.) and its absence made the run steps silently skip via the tag-not-fetchable fallback. On Windows, git archive of just the cuda_core subtree also hit a dangling-symlink extraction failure (cuda_core/.git_archival.txt).
  Refactor to:
  - run-tests: just install wheels and expose the resolved release version (CUDA_CORE_RELEASED_VER / NUMBA_CUDA_MLIR_VER) and cuda-core test-group name via GITHUB_ENV. No more git operations.
  - test-wheel-{linux,windows}.yml: add an actions/checkout step per mode that pulls the matching release tag into a subdirectory (cuda-core-released / numba-cuda-mlir-released), then the follow-up test step installs that tag's test dep-group and runs pytest.
  For numba-cuda-mlir also pass --ignore=tests/benchmarks --ignore=tests/doc_examples to pytest: those directories import the `numba` package at module top and would fail collection, which is cuSIMT's expected behavior (see NVIDIA/numba-cuda-mlir#136; cuSIMT intentionally does not depend on numba).
* CI: pin numpy<2.5 (mlir) and pytest<9.1 (cuda-core released tests)
  Two nightly failure fixups after the first green iteration:
  nightly-numba-cuda-mlir: numba-cuda-mlir 0.4.0 has an inverted guard that registers an overload of np.row_stack on NumPy 2.x, and NumPy 2.5 removed that name entirely, so test collection fails with "AttributeError: module 'numpy' has no attribute 'row_stack'". Cap numpy to <2.5. See NVIDIA/numba-cuda-mlir#154.
  nightly-cuda-core: released cuda-core v1.0.1's test suite uses a parametrize argvalues pattern that pytest 9.1 rejects ("in parametrize the number of names (1)... must be equal to the number of values (3)"). The main-side fix was #2212 but it has not shipped in a cuda-core release yet. Cap pytest to <9.1 for the released-cuda-core test run only.
* CI: deselect known pre-existing failures in nightly-cuda-core and nightly-numba-cuda-mlir
  Applied only in the affected nightly-* pytest invocations; the released source trees under test are unmodified.
  nightly-numba-cuda-mlir (all 10 tests deselected are from cuSIMT):
  * CudaArraySetting::{test_no_sync_default_stream, test_no_sync_supplied_stream, test_sync}
    TestCudaArrayInterface::{test_consume_no_sync, test_consume_sync, test_launch_no_sync, test_launch_sync, test_launch_sync_two_streams, test_fortran_contiguous}
    Serial-pytest contamination of numba_cuda_mlir.cuda.cudadrv from an xfailed test in test_nrt_comprehensive.py. Upstream CI runs with `pytest -n auto --dist loadscope`, which isolates the offending side effect in a separate xdist worker; our nightly runs serially and hits the pollution. See NVIDIA/numba-cuda-mlir#135.
  * TestLinkerDumpAssembly::test_nvjitlink_jit_with_linkable_code_lto_dump_assembly_warn
    Subprocess-invokes `cuobjdump`, which isn't on PATH in the base ubuntu:24.04 container. Filed as an upstream skip-guard bug.
  nightly-cuda-core (3 tests deselected are pre-existing v1.0.1 issues):
  * test_enum_coverage.py::test_wrapper_covers_all_binding_members[NvlinkVersion]
    Expected drift: main cuda-bindings adds NvlinkVersion.VERSION_6_0 which v1.0.1's wrapper mapping predates. This mode intentionally pairs released core with main bindings, so this coverage-style test will stay red here until a cuda-core release catches up.
  * test_rlcompleter_patch.py::test_opt_out_env_var_disables_patch_even_when_interactive
    Environment-dependent test: expects rlcompleter to crash without the tab-completion patch, but on Windows MCDM the pre-patch behavior is clean. Passes on Linux, fails on Windows MCDM.
  * test_memory.py::test_non_managed_resources_report_not_managed[pinned]
    Same underlying "Failed to allocate memory from pool" error that v1.0.1 already xfails in the sibling test_pinned_memory_resource_initialization (TODO(#9999)). cuda-python main has since fixed the parametrized case to route through _allocate_pinned_buffer_or_xfail(), but that fix hasn't shipped in a cuda-core release yet.
* CI: tighten deselects to per-platform failing sets
  Previously applied the same list on both Linux and Windows workflows, which over-deselected: some tests only fail on one platform because the underlying issues (serial-pytest test-order in mlir, MCDM-only behavior in cuda-core) are platform-specific. Now:
  nightly-numba-cuda-mlir
  - linux-64: TestCudaArrayInterface::{test_consume_no_sync, test_consume_sync, test_launch_no_sync, test_launch_sync, test_launch_sync_two_streams, test_fortran_contiguous} + TestLinkerDumpAssembly::test_nvjitlink_jit_with_linkable_code_lto_dump_assembly_warn.
  - win-64: CudaArraySetting::{test_no_sync_default_stream, test_no_sync_supplied_stream, test_sync} + TestCudaArrayInterface::test_fortran_contiguous.
  Test-order contamination in numba-cuda-mlir#135 surfaces different tests depending on collection order (linux-64 vs win-64 exercise different subsets), so the per-platform lists differ. cuobjdump-based TestLinkerDumpAssembly only fires on Linux because the ubuntu:24.04 container's PATH lacks cuobjdump; Windows runners ship it with the local CTK.
  nightly-cuda-core
  - linux-64: test_enum_coverage.py::test_wrapper_covers_all_binding_members[NvlinkVersion].
  - win-64: NvlinkVersion (same as Linux) + test_rlcompleter_patch.py::test_opt_out_env_var_disables_patch_even_when_interactive + test_memory.py::test_non_managed_resources_report_not_managed[pinned].
  rlcompleter and pinned mempool tests only fail on Windows MCDM. NvlinkVersion fails on both (expected drift for the mode).
* CI: version-gate the nightly-mode deselects so they auto-clean
  Each deselect is now wrapped in a bash conditional keyed on the installed release version. When a newer numba-cuda-mlir or cuda-core release ships with the referenced fix, the nightly picks it up automatically, the guard evaluates false, and the deselect drops, so the tests run against the new release. If they still fail we hear about it loudly rather than silently masking a regression.
  Current guards:
  - numba-cuda-mlir #135 tests + cuobjdump TestLinkerDumpAssembly: applied when installed numba-cuda-mlir version <= 0.4.0.
  - cuda-core NvlinkVersion / rlcompleter opt-out / pinned mempool: applied when installed cuda-core version <= 1.0.1.
  Structure keeps one conditional block per (mode, platform) with a comment above each deselect explaining the tracking issue.
* CI: broaden mlir deselect list to full #135 union across platforms
  The previous per-platform-tight lists were incomplete: NVIDIA/numba-cuda-mlir#135's import-time contamination poisons whichever tests reference cuda.cudadrv.driver AFTER the polluting xfail runs, and collection order varies between runs. Two consecutive Windows CI runs failed on different subsets (3 slicing tests one run, 5 interface tests the next).
  Deselect the full union of #135-listed tests + test_fortran_contiguous (observed to hit the same contamination) on both Linux and Windows. Same version guard (<= 0.4.0) still applies, so the whole block drops automatically when a newer numba-cuda-mlir release ships with the root-cause fix. Linux keeps the extra cuobjdump deselect (Linux-only environment issue).
* Revert "cuda_pathfinder: pin nvshmem to <3.7 (was previously excluding only 3.7.0)"
  This reverts commit 2a42aa7.
* Revert "Temporarily add push trigger to ci-nightly.yml for testing"
  This reverts commit a0ccd19.
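
The version-gated deselect idea from the commit message above can be sketched in Python. The actual change uses bash conditionals in the CI workflow, so this is only an illustration of the logic: `parse_release` and `deselect_args` are hypothetical helpers, the parser assumes plain dotted releases such as "1.0.1", and passing the deselects through pytest's `--deselect` option is an assumption about how they are applied.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as "1.0.1" (no pre/post-release suffixes)."""
    return tuple(int(part) for part in version.split("."))


def deselect_args(installed: str, last_affected: str, test_ids: list[str]) -> list[str]:
    """Return pytest --deselect arguments only while the installed release is affected."""
    if parse_release(installed) <= parse_release(last_affected):
        return [f"--deselect={test_id}" for test_id in test_ids]
    return []


known_failures = [
    "test_enum_coverage.py::test_wrapper_covers_all_binding_members[NvlinkVersion]",
]

# Guard is true for the affected release, so the deselect is applied.
print(deselect_args("1.0.1", "1.0.1", known_failures))
# A newer release makes the guard false, so the test runs again.
print(deselect_args("1.0.2", "1.0.1", known_failures))
```

Keying the guard on "installed version <= last affected release" means nothing has to be edited when a fixed release ships; the deselect simply stops being emitted.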
Summary
main has been red since pytest 9.1.0 landed on PyPI: every `Test linux-*` / `Test win-*` matrix entry fails at pytest collection time, before any actual test runs. The cause is two unrelated latent bugs in our test code, both tolerated by older pytest but rejected by pytest 9.1.0's stricter `parametrize` validation.
Bug 1: trailing comma in `parametrize` name (cuda_core)
`cuda_core/tests/test_utils.py:151` had:
`@pytest.mark.parametrize("in_arr,", _cpu_array_samples())`
The `,` inside the string was a stray. pytest 9.1.0 splits names on comma, ends up with one name but 3-tuple values, and fails collection with an error of the form "in parametrize the number of names (1)... must be equal to the number of values (3)".
Fix: drop the trailing comma.
Bug 2: `indirect=True` override of a fixture-level parametrize (cuda_bindings)
`cuda_bindings/tests/test_nvfatbin.py` has an `arch` fixture parametrized with `params=ARCHITECTURES`. Two tests overrode it via `@pytest.mark.parametrize("arch", ["sm_80"], indirect=True)`. pytest 9.1.0 now rejects this as "duplicate parametrization of 'arch'".
Fix: extract the CUBIN-building logic from the `CUBIN` fixture into a `_build_cubin(arch)` helper, drop the `indirect` override on the two affected tests, and call the helper directly with `"sm_80"`. This preserves the original intent: those tests intentionally used only sm_80, since target arch `"75"` must not match the CUBIN's arch.
Backwards compatibility
Both fixes are pytest-version-agnostic, so the pip pin (`pytest>=6.2.4`) doesn't need to change. Verified by collecting cleanly under pytest 9.1.0, 9.0.2, and 8.4.2 with minimal reproductions of each pattern.
Reference
Affected CI runs on main:
Same pattern on my open #2210 (https://github.com/NVIDIA/cuda-python/actions/runs/27489049015): the 38 `Run cuda.core tests` failures + 23 `Run cuda.bindings tests` failures all stem from these two collection errors.